This data set contains transcribed high-quality audio of Catalan sentences
recorded by volunteers. The data set consists of wave files and a TSV file
(line_index.tsv). The file line_index.tsv lists, for each recording, an
anonymized FileID and the transcription of its audio.
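<p>
As a minimal sketch of how line_index.tsv might be loaded, the Python snippet
below maps each FileID to its transcription. It assumes that every non-empty
row holds the FileID in the first tab-separated column and the transcription in
the last one; the exact column layout is not specified here, so check it against
the actual file. The function name read_line_index and the sample rows are
hypothetical and only serve as illustration.

```python
import csv
import io


def read_line_index(tsv_file):
    """Map each FileID to its transcription.

    Assumption: each non-empty row has the FileID in the first column and
    the transcription in the last column, separated by tabs.
    """
    transcripts = {}
    # QUOTE_NONE keeps quotation marks inside sentences as literal text.
    reader = csv.reader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue  # skip blank lines
        transcripts[row[0].strip()] = row[-1].strip()
    return transcripts


# Made-up rows for illustration only; real FileIDs and sentences differ.
sample = io.StringIO("speaker_0001\tBon dia a tothom.\nspeaker_0002\tFins aviat.\n")
index = read_line_index(sample)
print(index["speaker_0001"])
```

For the real file, pass an open file handle instead of the in-memory sample,
for example open("line_index.tsv", encoding="utf-8", newline=""). How a FileID
maps to a wave file name is not stated here and should be checked in the
archive itself.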
<p>
The data set has been manually quality-checked, but it might still contain errors.
<p>
Please report any issues in the following issue tracker on GitHub:
<a href="https://github.com/googlei18n/language-resources/issues">
  https://github.com/googlei18n/language-resources/issues
</a>
<p>
See LICENSE file for license information.
<p>
Copyright 2018, 2019 Google, Inc.
<p>
If you use this data in publications, please cite it as follows:
<pre>
  @inproceedings{kjartansson-etal-2020-open,
    title = {{Open-Source High Quality Speech Datasets for Basque, Catalan and Galician}},
    author = {Kjartansson, Oddur and Gutkin, Alexander and Butryna, Alena and Demirsahin, Isin and Rivera, Clara},
    booktitle = {Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL)},
    year = {2020},
    pages = {21--27},
    month = may,
    address = {Marseille, France},
    publisher = {European Language Resources Association (ELRA)},
    url = {https://www.aclweb.org/anthology/2020.sltu-1.3},
    ISBN = {979-10-95546-35-1},
  }
</pre>
